Annotation inconsistencies beyond sequence similarity-based function prediction – phylogeny and genome structure
Abstract
The function annotation process in computational biology has increasingly shifted from the traditional characterization of individual biochemical roles of protein molecules to the system-wide detection of entire metabolic pathways and genomic structures. The so-called genome-aware methods broaden the scope of annotation inconsistencies in genome sequences beyond protein function assignments to encompass phylogenetic anomalies and artifactual genomic regions. We outline three categories of error propagation in databases, with striking examples at various levels of appreciation by the community, from traditional to emerging, to raise awareness for future solutions.
Similar resources
G-protein coupled receptor subfamily identification using phylogenetic comparison of gene and species trees
Most approaches to prediction of protein function from primary structure are based on similarity between the query sequence and sequences of known function. This approach, however, disregards the occurrence of gene duplication (paralogy) or convergent evolution of the genes. The analysis of correlated proteins that share a common domain, taking into consideration the evolutionary history of gen...
ESG: extended similarity group method for automated protein function prediction
MOTIVATION: The importance of accurate automatic protein function prediction is ever increasing in the face of a large number of newly sequenced genomes and proteomics data that are awaiting biological interpretation. Conventional methods have focused on high sequence similarity-based annotation transfer, which relies on the concept of homology. However, many cases have been reported that simple tran...
An improved hypergeometric probability method for identification of functionally linked proteins using phylogenetic profiles
Predicting functions of proteins and alternatively spliced isoforms encoded in a genome is one of the important applications of bioinformatics in the post-genome era. Due to the practical limitation of experimental characterization of all proteins encoded in a genome using biochemical studies, bioinformatics methods provide powerful tools for function annotation and prediction. These methods al...
Family Classification and Integrative Analysis for Protein Functional Annotation
The high-throughput genome projects have resulted in a rapid accumulation of predicted protein sequences; however, experimentally verified information on protein function lags far behind. The common approach to inferring function of uncharacterized proteins based on sequence similarity to annotated proteins in sequence databases often results in over-identification, under-identification, or even...
Molecular phylogeny of three desert truffles from Iran based on ribosomal genome
The ITS region, including the 5.8S gene of rDNA, of three desert truffle species was amplified using ITS4 and ITS1 primers. The ITS sequences were compared to those of other related authentic sequences obtained from GenBank. Among 12 specimens studied, seven isolates corresponded to Terfezia claveryi reported by other authors. Iranian T. claveryi specimens had an average similarity of 99.4% (ran...
Journal title:
Volume 10, Issue
Pages -
Publication date: 2015